ABSTRACT

We consider compressed sensing of block-sparse signals, i.e.,
sparse signals that have nonzero coefficients occurring in clusters.
Based on an uncertainty relation for block-sparse signals, we define
a block-coherence measure and we show that a block-version of the
orthogonal matching pursuit algorithm recovers block $k$-sparse signals
in no more than $k$ steps if the block-coherence is sufficiently
small. The same condition on block-sparsity is shown to guarantee
successful recovery through a mixed $\ell_2/\ell_1$ optimization approach.
The significance of the results lies in the fact that making explicit
use of block-sparsity can yield better reconstruction properties than
treating the signal as being sparse in the conventional sense, thereby
ignoring the additional structure in the problem.

Index Terms — block sparsity, coherence, uncertainty relations 

1. INTRODUCTION 

We consider compressed sensing [1, 2] of sparse signals that exhibit
additional structure in the form of the nonzero coefficients occurring
in clusters. It is therefore natural to ask whether explicitly taking this
block-sparse structure into account yields improvements over treating
the signal as a conventional sparse signal. It was shown in [3, 4]
that the answer is in the affirmative. Moreover, in [3] the restricted
amplification property was shown to provide a sufficient condition
for robust recovery of model-compressible (which includes block-sparse)
signals. It is furthermore shown in [3] that simple modifications
of the CoSaMP algorithm [5] and of iterative hard thresholding [6]
yield reconstruction algorithms for the model-based case (including
block-sparsity) that exhibit provable robustness properties.
A mixed $\ell_2/\ell_1$-norm algorithm for recovering block-sparse signals
was introduced in [4]. The block restricted isometry property defined
in [4] provides equivalence conditions for guaranteeing recovery of
block-sparse signals.

The focus of the present paper is on the notion of coherence for
block-sparse signals, i.e., block-coherence, and can be seen as extending
the program laid out in [7, 8] to the block-sparse case. We
introduce a block version of the orthogonal matching pursuit algorithm
(BOMP) and find a sufficient condition on block-coherence
to guarantee recovery of block $k$-sparse signals through BOMP in
no more than $k$ steps. The same condition on block-coherence is
shown to guarantee successful recovery through the mixed $\ell_2/\ell_1$
optimization approach, described in [4, 9]. These results are akin to a
sufficient condition on conventional coherence reported in [7] that
guarantees recovery through OMP or $\ell_1$-optimization. Finally, we
establish an uncertainty relation for block-sparse signals and show
how the block-coherence measure defined previously occurs naturally
in this uncertainty relation.

Notation. Throughout the paper, we denote vectors in $\mathbb{C}^N$ by
boldface lowercase letters, e.g., $\mathbf{x}$, and matrices by boldface
uppercase letters, e.g., $\mathbf{A}$. The identity matrix is written as
$\mathbf{I}$ or $\mathbf{I}_d$ when the dimension is not clear from the context.
Given a matrix $\mathbf{A}$, $\mathbf{A}^T$ and $\mathbf{A}^H$ are its transpose
and conjugate transpose, respectively, $\mathbf{A}^\dagger$ is the pseudo inverse,
$\mathcal{R}(\mathbf{A})$ denotes its range space, $\mathbf{A}_{i,j}$ is the
element in the $i$th row and $j$th column, and $\mathbf{a}_\ell$ denotes its
$\ell$th column. The $\ell$th element of a vector $\mathbf{x}$ is denoted by
$x_\ell$. The standard Euclidean norm is $\|\mathbf{x}\|_2 = \sqrt{\mathbf{x}^H\mathbf{x}}$,
$\|\mathbf{x}\|_1 = \sum_\ell |x_\ell|$ is the $\ell_1$-norm,
$\|\mathbf{x}\|_\infty = \max_\ell |x_\ell|$ is the $\ell_\infty$-norm, and
$\|\mathbf{x}\|_0$ designates the number of nonzero entries in $\mathbf{x}$.
The Kronecker product of the matrices $\mathbf{A}$ and $\mathbf{B}$ is written as
$\mathbf{A} \otimes \mathbf{B}$. The spectral radius of $\mathbf{A}$ is denoted
as $\rho(\mathbf{A}) = \lambda_{\max}^{1/2}(\mathbf{A}^H\mathbf{A})$, where
$\lambda_{\max}(\mathbf{B})$ is the largest eigenvalue of the positive-semidefinite
matrix $\mathbf{B}$.

2. BLOCK-SPARSITY 

Block-sparsity. We consider the problem of representing a vector
$\mathbf{y} \in \mathbb{C}^L$ in a given dictionary $\mathbf{D}$ of size
$L \times N$ with $L < N$, so that

$$\mathbf{y} = \mathbf{D}\mathbf{x} \tag{1}$$

for a coefficient vector $\mathbf{x} \in \mathbb{C}^N$. We require $\mathbf{x}$
to be block-sparse, where, throughout the paper, blocks are always assumed to be of
length $d$. To define block-sparsity, we view $\mathbf{x}$ as a concatenation of
blocks (of length $d$) with $\mathbf{x}[\ell]$ denoting the $\ell$th sub-block, i.e.,

$$\mathbf{x} = [\underbrace{x_1 \,\ldots\, x_d}_{\mathbf{x}[1]} \;\; \underbrace{x_{d+1} \,\ldots\, x_{2d}}_{\mathbf{x}[2]} \;\ldots\; \underbrace{x_{N-d+1} \,\ldots\, x_N}_{\mathbf{x}[M]}]^T \tag{2}$$

with $N = Md$. We furthermore assume that $L = Rd$ with $R$ integer.
A vector $\mathbf{x} \in \mathbb{C}^N$ is called block $k$-sparse if
$\mathbf{x}[\ell]$ has nonzero Euclidean norm for at most $k$ indices $\ell$.
When $d = 1$, block-sparsity reduces to the conventional definition of sparsity
as in [1, 2]. Denoting

$$\|\mathbf{x}\|_{2,0} = \sum_{\ell=1}^{M} I(\|\mathbf{x}[\ell]\|_2 > 0) \tag{3}$$

where $I(\|\mathbf{x}[\ell]\|_2 > 0) = 1$ if $\|\mathbf{x}[\ell]\|_2 > 0$ and $0$
otherwise, a block $k$-sparse vector $\mathbf{x}$ is defined as a vector that
satisfies $\|\mathbf{x}\|_{2,0} \le k$. In the remainder of the paper conventional
sparsity will be referred to simply as sparsity, in contrast to block-sparsity.
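As a small illustration of definition (3), the block-sparsity level can be obtained by counting the blocks with nonzero Euclidean norm. The following is a minimal sketch, assuming NumPy; the function name `block_l0` is ours and does not appear in the paper.

```python
import numpy as np

def block_l0(x, d):
    """Count the length-d blocks of x that have nonzero Euclidean norm."""
    x = np.asarray(x)
    if x.size % d != 0:
        raise ValueError("len(x) must be a multiple of the block length d")
    blocks = x.reshape(-1, d)               # row l holds the block x[l]
    norms = np.linalg.norm(blocks, axis=1)  # ||x[l]||_2 for every block
    return int(np.count_nonzero(norms > 0))

x = np.array([0, 0, 1.5, -2.0, 0, 0, 0, 0.3])  # N = 8
print(block_l0(x, d=2))  # prints 2: two nonzero blocks of length 2
print(block_l0(x, d=1))  # prints 3: conventional sparsity ||x||_0
```

With $d = 1$ the same routine returns the conventional sparsity level, in line with the remark above.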

Problem statement. Our goal is to provide conditions on the
dictionary $\mathbf{D}$ ensuring that the block-sparse vector $\mathbf{x}$ can be
reconstructed from measurements of the form (1) through computationally
efficient algorithms. Our approach is largely based on [7, 10] (and
the mathematical techniques used therein) where equivalent results
are provided for the sparse case. The results in [7, 10] are stated
in terms of the dictionary coherence. Therefore, as a first step in
our development, we extend this conventional coherence measure
to block-sparsity by defining block-coherence. Before introducing
the corresponding definition, we cite the following proposition taken
from [4].



Proposition 1. The representation (1) is unique if and only if
$\mathbf{D}\mathbf{g} \ne \mathbf{0}$ for every $\mathbf{g} \ne \mathbf{0}$ that is block $2k$-sparse.

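Similarly to (2), we can represent $\mathbf{D}$ as a concatenation of
column-blocks $\mathbf{D}[\ell]$ of size $L \times d$:

$$\mathbf{D} = [\underbrace{\mathbf{d}_1 \,\ldots\, \mathbf{d}_d}_{\mathbf{D}[1]} \;\; \underbrace{\mathbf{d}_{d+1} \,\ldots\, \mathbf{d}_{2d}}_{\mathbf{D}[2]} \;\ldots\; \underbrace{\mathbf{d}_{N-d+1} \,\ldots\, \mathbf{d}_N}_{\mathbf{D}[M]}]. \tag{4}$$

Since from Proposition 1 the columns of $\mathbf{D}[\ell]$, $\forall \ell$, are linearly
independent, we may write $\mathbf{D}[\ell] = \mathbf{A}[\ell]\mathbf{W}_\ell$ where
$\mathbf{A}[\ell]$ consists of orthonormal columns that span $\mathcal{R}(\mathbf{D}[\ell])$
and $\mathbf{W}_\ell$ is invertible. Denoting by $\mathbf{A}$ the $L \times N$ matrix with
blocks $\mathbf{A}[\ell]$, and by $\mathbf{W}$ the $N \times N$ block-diagonal matrix with
blocks $\mathbf{W}_\ell$, we conclude that $\mathbf{D} = \mathbf{A}\mathbf{W}$. Since
$\mathbf{W}$ is block-diagonal and invertible, $\mathbf{c} = \mathbf{W}\mathbf{x}$ is
block-sparse with the same block-sparsity level as $\mathbf{x}$. Therefore, in
the sequel, we assume, without loss of generality, that $\mathbf{D}$ consists of
orthonormal blocks, i.e., $\mathbf{D}^H[\ell]\mathbf{D}[\ell] = \mathbf{I}_d$. Throughout
the paper, we furthermore assume that the dictionaries we consider satisfy the
condition of Proposition 1.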
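The factorization $\mathbf{D} = \mathbf{A}\mathbf{W}$ can be sketched numerically with one QR decomposition per block. This is only an illustration under our own assumptions (NumPy, a random real dictionary, and the hypothetical helper name `orthonormalize_blocks`); the paper does not prescribe how the orthonormal bases are computed.

```python
import numpy as np

def orthonormalize_blocks(D, d):
    """Factor D = A @ W with orthonormal L x d blocks A[l] and a
    block-diagonal W whose d x d blocks W_l are invertible."""
    L, N = D.shape
    A = np.zeros((L, N), dtype=complex)
    W = np.zeros((N, N), dtype=complex)
    for l in range(N // d):
        cols = slice(l * d, (l + 1) * d)
        # Reduced QR of the block: Q has orthonormal columns spanning R(D[l]),
        # R is d x d and invertible when the columns of D[l] are independent.
        Q, R = np.linalg.qr(D[:, cols])
        A[:, cols] = Q
        W[cols, cols] = R
    return A, W

rng = np.random.default_rng(0)
d, L, N = 2, 6, 8
D = rng.standard_normal((L, N))
A, W = orthonormalize_blocks(D, d)

x = np.zeros(N)
x[2:4] = [1.0, -1.0]        # a single nonzero block (second block)
c = W @ x                   # coefficients with respect to A
print(np.allclose(A @ c, D @ x))   # both factorizations describe the same signal
```

Because $\mathbf{W}$ is block-diagonal, `c` is supported on the same block as `x`, which is the property used to assume orthonormal blocks without loss of generality.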

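Block-coherence. We define the block-coherence of $\mathbf{D}$ as

$$\mu_B = \max_{\ell,\, r \ne \ell} \frac{1}{d}\,\rho(\mathbf{M}[\ell, r]) \quad \text{with} \quad \mathbf{M}[\ell, r] = \mathbf{D}^H[\ell]\mathbf{D}[r]. \tag{5}$$

Note that $\mathbf{M}[\ell, r]$ is the $(\ell, r)$th $d \times d$ block of the
$N \times N$ matrix $\mathbf{M} = \mathbf{D}^H\mathbf{D}$. When $d = 1$, $\mu_B$
reduces to the conventional definition of coherence [7, 10, 11]

$$\mu = \max_{\ell,\, r \ne \ell} |\mathbf{d}_\ell^H \mathbf{d}_r|. \tag{6}$$

It is easy to see that the definition in (5) is invariant to the choice of
orthonormal basis $\mathbf{D}[\ell]$ for $\mathcal{R}(\mathbf{D}[\ell])$. This is
because $\rho(\mathbf{M}[\ell, r]) = \rho(\mathbf{U}_\ell^H \mathbf{M}[\ell, r]\mathbf{U}_r)$
for unitary matrices $\mathbf{U}_\ell$ and $\mathbf{U}_r$. In the remainder of the paper
conventional coherence will be referred to simply as coherence, in contrast to
block-coherence.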

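Proposition 2. The block-coherence $\mu_B$ satisfies $0 \le \mu_B \le 1$.

Proof. Clearly $\mu_B \ge 0$. To prove that $\mu_B \le 1$, note that
$\rho(\mathbf{A}) \le \|\mathbf{A}\|$, where $\|\mathbf{A}\|$ is any matrix norm.
In particular, if $\mathbf{A}$ is a $d \times d$ matrix, then

$$\rho(\mathbf{A}) \le \max_i \sum_j |\mathbf{A}_{i,j}| \le d \max_{i,j} |\mathbf{A}_{i,j}|. \tag{7}$$

In our case, $\mathbf{A} = \mathbf{M}[\ell, r]$. Since the columns of $\mathbf{D}$ are
normalized, all the elements of $\mathbf{M}[\ell, r]$ have absolute value smaller than
or equal to 1, so that from (7), $\rho(\mathbf{M}[\ell, r]) \le d$, and hence
$\mu_B \le 1$. □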

It is interesting to compare $\mu_B$ with the coherence $\mu$ defined in
(6) for the same dictionary $\mathbf{D}$.

Proposition 3. For any dictionary $\mathbf{D}$, we have $\mu_B \le \mu$.
The proof follows immediately from (7).
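The two coherence measures can be compared numerically. The sketch below assumes NumPy, takes $\rho$ to be the largest singular value of each $d \times d$ block, and uses a random dictionary whose blocks are orthonormalized by a QR step; the function names and the test dictionary are our own choices for illustration.

```python
import numpy as np

def block_coherence(D, d):
    """mu_B as in (5): max over l != r of rho(D[l]^H D[r]) / d, with rho
    taken as the largest singular value of the d x d block."""
    M = D.shape[1] // d
    G = D.conj().T @ D                       # Gram matrix D^H D
    best = 0.0
    for l in range(M):
        for r in range(M):
            if l != r:
                blk = G[l*d:(l+1)*d, r*d:(r+1)*d]
                best = max(best, np.linalg.norm(blk, 2) / d)
    return best

def coherence(D):
    """Conventional coherence (6): largest |d_l^H d_r| over l != r."""
    G = np.abs(D.conj().T @ D)
    np.fill_diagonal(G, 0.0)
    return G.max()

rng = np.random.default_rng(1)
d, L, M = 2, 8, 6
# Random dictionary whose L x d blocks are orthonormalized by a QR step.
D = np.hstack([np.linalg.qr(rng.standard_normal((L, d)))[0] for _ in range(M)])
mu_b, mu = block_coherence(D, d), coherence(D)
print(0.0 <= mu_b <= mu <= 1.0 + 1e-12)
```

For $d = 1$ the two functions return the same value, which mirrors the reduction of (5) to (6).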

3. UNCERTAINTY RELATION FOR BLOCK-SPARSITY 

We next show how the block-coherence $\mu_B$ defined above naturally
appears in an uncertainty relation for block-sparse signals. This
uncertainty relation generalizes the corresponding result for the sparse
case reported in [10].

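The uncertainty principle for the sparse case is concerned with
pairs of representations of a vector $\mathbf{x} \in \mathbb{C}^N$ in two different
orthonormal bases for $\mathbb{C}^N$: $\{\boldsymbol{\phi}_\ell, 1 \le \ell \le N\}$ and
$\{\boldsymbol{\psi}_\ell, 1 \le \ell \le N\}$ [10, 11]. Any vector
$\mathbf{x} \in \mathbb{C}^N$ can be expanded uniquely in terms of
each one of these bases according to:

$$\mathbf{x} = \sum_{\ell=1}^{N} a_\ell \boldsymbol{\phi}_\ell = \sum_{\ell=1}^{N} b_\ell \boldsymbol{\psi}_\ell. \tag{8}$$

The uncertainty relation sets limits on the sparsity of the decompositions (8)
for any $\mathbf{x} \in \mathbb{C}^N$. Specifically, denoting $A = \|\mathbf{a}\|_0$ and
$B = \|\mathbf{b}\|_0$, it is shown in [10] that

$$\frac{1}{2}(A + B) \ge \sqrt{AB} \ge \frac{1}{\mu(\mathbf{\Phi}, \mathbf{\Psi})} \tag{9}$$

where $\mu(\mathbf{\Phi}, \mathbf{\Psi})$ is the coherence between $\mathbf{\Phi}$ and
$\mathbf{\Psi}$ defined by

$$\mu(\mathbf{\Phi}, \mathbf{\Psi}) = \max_{\ell, r} |\boldsymbol{\phi}_\ell^H \boldsymbol{\psi}_r|. \tag{10}$$

In [11] it is shown that $1/\sqrt{N} \le \mu(\mathbf{\Phi}, \mathbf{\Psi}) \le 1$. We now
develop an uncertainty principle for block-sparse decompositions,
analogous to (9). Specifically, we find a result that is equivalent to
(9) with $A$ and $B$ replaced by block-sparsity levels as defined in (3)
and $\mu(\mathbf{\Phi}, \mathbf{\Psi})$ replaced by the block-coherence between the
orthonormal bases considered, as defined in (13).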

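Theorem 1. [12] Let $\mathbf{\Phi}, \mathbf{\Psi}$ be two unitary matrices with
$L \times d$ blocks $\{\mathbf{\Phi}[\ell], \mathbf{\Psi}[\ell], 1 \le \ell \le M\}$ and let
$\mathbf{x} \in \mathbb{C}^N$ satisfy

$$\mathbf{x} = \sum_{\ell=1}^{M} \mathbf{\Phi}[\ell]\mathbf{a}[\ell] = \sum_{\ell=1}^{M} \mathbf{\Psi}[\ell]\mathbf{b}[\ell]. \tag{11}$$

Let $A = \|\mathbf{a}\|_{2,0}$ and $B = \|\mathbf{b}\|_{2,0}$. Then,

$$\frac{1}{2}(A + B) \ge \sqrt{AB} \ge \frac{1}{d\,\mu_B(\mathbf{\Phi}, \mathbf{\Psi})} \tag{12}$$

where

$$\mu_B(\mathbf{\Phi}, \mathbf{\Psi}) = \max_{\ell, r} \frac{1}{d}\,\rho(\mathbf{\Phi}^H[\ell]\mathbf{\Psi}[r]). \tag{13}$$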



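It can easily be shown that for $\mathbf{D}$ consisting of the orthonormal
bases $\mathbf{\Phi}$ and $\mathbf{\Psi}$, i.e., $\mathbf{D} = [\mathbf{\Phi} \; \mathbf{\Psi}]$,
we have $\mu_B(\mathbf{\Phi}, \mathbf{\Psi}) = \mu_B$, where $\mu_B$ is as defined in (5)
and associated with $\mathbf{D} = [\mathbf{\Phi} \; \mathbf{\Psi}]$.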
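The block uncertainty relation (12) can be checked on a numerical example. The sketch below is an illustration only: it assumes NumPy, draws two random unitary bases via QR (our choice, not the paper's), takes $\rho$ as the largest singular value, and counts nonzero blocks with a small tolerance.

```python
import numpy as np

def random_unitary(n, rng):
    """Random n x n unitary matrix from the QR factor of a complex Gaussian."""
    Z = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return np.linalg.qr(Z)[0]

def pair_block_coherence(Phi, Psi, d):
    """mu_B(Phi, Psi) as in (13), rho taken as the largest singular value."""
    A = Phi.conj().T @ Psi
    M = A.shape[0] // d
    return max(np.linalg.norm(A[l*d:(l+1)*d, r*d:(r+1)*d], 2)
               for l in range(M) for r in range(M)) / d

def block_l0(v, d, tol=1e-10):
    """Number of length-d blocks of v with norm above a numerical tolerance."""
    return int(np.count_nonzero(np.linalg.norm(v.reshape(-1, d), axis=1) > tol))

rng = np.random.default_rng(2)
d, M = 2, 5
N = d * M
Phi, Psi = random_unitary(N, rng), random_unitary(N, rng)

a = np.zeros(N, dtype=complex)
a[0:d] = rng.standard_normal(d)      # block 1-sparse representation in Phi
x = Phi @ a
b = Psi.conj().T @ x                 # representation of the same x in Psi

A_lvl, B_lvl = block_l0(a, d), block_l0(b, d)
lower = 1.0 / (d * pair_block_coherence(Phi, Psi, d))
print(np.sqrt(A_lvl * B_lvl) >= lower)
```

A vector that is block-sparse in one basis is forced to occupy many blocks in the other whenever the block-coherence of the pair is small.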

The bound provided by Theorem 1 can be tighter than that obtained
by applying the conventional uncertainty relation (9) to the
block-sparse case. This can be seen by using $\|\mathbf{a}\|_0 \le d\|\mathbf{a}\|_{2,0}$,
$\|\mathbf{b}\|_0 \le d\|\mathbf{b}\|_{2,0}$, and (9) to obtain

$$\sqrt{\|\mathbf{a}\|_{2,0}\,\|\mathbf{b}\|_{2,0}} \ge \frac{1}{d\mu}. \tag{14}$$

Since $\mu_B \le \mu$, this bound can be looser than (12).
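For completeness, the steps behind this comparison can be written out as follows; this is our own expansion of the argument, using only (9), (12), and Proposition 3.

```latex
\begin{align*}
\|\mathbf{a}\|_0\,\|\mathbf{b}\|_0
  &\le d^2\,\|\mathbf{a}\|_{2,0}\,\|\mathbf{b}\|_{2,0}, \\
\sqrt{\|\mathbf{a}\|_{2,0}\,\|\mathbf{b}\|_{2,0}}
  &\ge \frac{1}{d}\sqrt{\|\mathbf{a}\|_0\,\|\mathbf{b}\|_0}
   \ge \frac{1}{d\,\mu(\mathbf{\Phi},\mathbf{\Psi})}, \\
\frac{1}{d\,\mu_B(\mathbf{\Phi},\mathbf{\Psi})}
  &\ge \frac{1}{d\,\mu(\mathbf{\Phi},\mathbf{\Psi})}
   \quad \text{since } \mu_B \le \mu .
\end{align*}
```

The first line uses that each nonzero block contributes at most $d$ nonzero entries, the second applies (9), and the third shows that the lower bound in (12) is never smaller than the one obtained from the conventional relation.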

3.1. Block-incoherent dictionaries 



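As already noted, in the sparse case (i.e., $d = 1$) for any two
orthonormal bases $\mathbf{\Phi}$ and $\mathbf{\Psi}$, we have $\mu \ge 1/\sqrt{N}$.
We next show that the block-coherence satisfies a similar inequality, namely
$\mu_B \ge 1/\sqrt{dN}$. Evidently, the lower bound on $\mu$ is $\sqrt{d}$ times larger
than that on $\mu_B$. To prove the lower bound on $\mu_B$, let $\mathbf{\Phi}$ and
$\mathbf{\Psi}$ denote two orthonormal bases for $\mathbb{C}^N$ and let
$\mathbf{A} = \mathbf{\Phi}^H\mathbf{\Psi}$, where $\mathbf{A}[\ell, r]$
stands for the $(\ell, r)$th $d \times d$ block of $\mathbf{A}$. With $M = N/d$, we have

$$M^2 d^2 \mu_B^2 \ge \sum_{\ell=1}^{M}\sum_{r=1}^{M} \lambda_{\max}(\mathbf{A}^H[\ell, r]\mathbf{A}[\ell, r]) \ge \frac{1}{d}\,\mathrm{Tr}\!\left(\sum_{\ell=1}^{M}\sum_{r=1}^{M} \mathbf{A}^H[\ell, r]\mathbf{A}[\ell, r]\right). \tag{15}$$

Now, it holds that

$$\sum_{\ell=1}^{M}\sum_{r=1}^{M} \mathbf{A}^H[\ell, r]\mathbf{A}[\ell, r] = \sum_{r=1}^{M} \mathbf{\Psi}^H[r]\left(\sum_{\ell=1}^{M} \mathbf{\Phi}[\ell]\mathbf{\Phi}^H[\ell]\right)\mathbf{\Psi}[r]. \tag{16}$$

Since $\mathbf{\Phi}$ consists of orthonormal columns,
$\sum_{\ell=1}^{M} \mathbf{\Phi}[\ell]\mathbf{\Phi}^H[\ell] = \mathbf{\Phi}\mathbf{\Phi}^H = \mathbf{I}_N$.
Furthermore, since $\mathbf{\Psi}[r]$ consists of orthonormal columns, $\forall r$, we have
$\mathbf{\Psi}^H[r]\mathbf{\Psi}[r] = \mathbf{I}_d$, $\forall r$. Therefore, (15) becomes

$$\mu_B^2 \ge \frac{1}{M d^2} = \frac{1}{dN} \tag{17}$$

which concludes the proof.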

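We now construct a pair of bases that achieves the lower bound
on $\mu_B$ and therefore has the smallest possible block-coherence.
Let $\mathbf{F}$ be the DFT matrix of size $M = N/d$ with
$\mathbf{F}_{\ell, r} = (1/\sqrt{M}) \exp(j 2\pi \ell r / M)$. Define
$\mathbf{\Phi} = \mathbf{I}_N$ and

$$\mathbf{\Psi} = \mathbf{F} \otimes \mathbf{U}_d \tag{18}$$

where $\mathbf{U}_d$ is an arbitrary $d \times d$ unitary matrix. For this choice,
$\mathbf{\Phi}^H[\ell]\mathbf{\Psi}[r] = \mathbf{F}_{\ell, r}\mathbf{U}_d$. Since
$\rho(\mathbf{U}_d) = 1$ and $|\mathbf{F}_{\ell, r}| = 1/\sqrt{M}$, we get

$$\mu_B = \frac{1}{d\sqrt{M}} = \frac{1}{\sqrt{dN}}. \tag{19}$$

When $d = 1$, this basis pair reduces to the spike-Fourier pair which
is well known to be maximally incoherent [11].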
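The construction (18) is easy to reproduce numerically. The following sketch assumes NumPy, uses zero-based indices for the DFT matrix, and draws $\mathbf{U}_d$ from the QR factor of a random complex matrix (any unitary $\mathbf{U}_d$ would do); the helper name `dft_block_pair` is ours.

```python
import numpy as np

def dft_block_pair(M, d, seed=0):
    """Return (Phi, Psi) with Phi = I_N and Psi = F kron U_d, where N = M*d."""
    rng = np.random.default_rng(seed)
    idx = np.arange(M)
    # Unitary DFT matrix: entries of magnitude 1/sqrt(M).
    F = np.exp(2j * np.pi * np.outer(idx, idx) / M) / np.sqrt(M)
    # An arbitrary d x d unitary matrix (here: Q factor of a random matrix).
    Z = rng.standard_normal((d, d)) + 1j * rng.standard_normal((d, d))
    U = np.linalg.qr(Z)[0]
    return np.eye(M * d), np.kron(F, U)

M, d = 4, 3
N = M * d
Phi, Psi = dft_block_pair(M, d)
A = Phi.conj().T @ Psi
# Block-coherence of the pair: largest spectral norm of a d x d block, over d.
mu_b = max(np.linalg.norm(A[l*d:(l+1)*d, r*d:(r+1)*d], 2)
           for l in range(M) for r in range(M)) / d
print(np.isclose(mu_b, 1 / np.sqrt(d * N)))   # prints True
```

Every $d \times d$ block of $\mathbf{\Phi}^H\mathbf{\Psi}$ is a scaled copy of $\mathbf{U}_d$, so all blocks attain the same spectral norm and the lower bound is met.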

4. EFFICIENT RECOVERY ALGORITHMS 

We now give operational meaning to block-coherence by showing
that if it is small enough, a block-sparse signal $\mathbf{x}$ can be recovered
from $\mathbf{y} = \mathbf{D}\mathbf{x}$ using computationally efficient algorithms.
We consider two different algorithms, namely the mixed $\ell_2/\ell_1$ optimization
program proposed in [4]:

$$\min_{\mathbf{x}} \sum_{\ell=1}^{M} \|\mathbf{x}[\ell]\|_2 \quad \text{s.t.} \quad \mathbf{y} = \mathbf{D}\mathbf{x} \tag{20}$$

and an extension of the orthogonal matching pursuit (OMP) algorithm [13]
to the block-sparse case described below and termed BOMP. We then show that
both methods recover the correct block-sparse $\mathbf{x}$ as long as $\mu_B$
associated with $\mathbf{D}$ is small enough.

4.1. Block OMP 

The BOMP algorithm is similar in spirit to the conventional OMP
algorithm, and can serve as a computationally attractive alternative
to (20).

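The algorithm begins by initializing the residual as $\mathbf{r}_0 = \mathbf{y}$. At
the $\ell$th stage ($\ell \ge 1$) we choose the subspace that is best matched to
$\mathbf{r}_{\ell-1}$ according to:

$$i_\ell = \arg\max_{i} \|\mathbf{D}^H[i]\mathbf{r}_{\ell-1}\|_2. \tag{21}$$

Once the index $i_\ell$ is chosen, we find the optimal coefficients by
computing $\mathbf{x}_\ell[i]$ as the solution to

$$\min \Big\| \mathbf{y} - \sum_{i \in \mathcal{I}} \mathbf{D}[i]\mathbf{x}_\ell[i] \Big\|_2^2. \tag{22}$$

Here $\mathcal{I}$ is the set of chosen indices $i_j$, $1 \le j \le \ell$. The residual is
then updated as

$$\mathbf{r}_\ell = \mathbf{y} - \sum_{i \in \mathcal{I}} \mathbf{D}[i]\mathbf{x}_\ell[i]. \tag{23}$$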
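The three steps (21)-(23) can be sketched as follows. This is a minimal illustration assuming NumPy, not a reference implementation: the function name `bomp`, the stopping tolerance, and the test dictionary $\mathbf{D} = [\mathbf{I} \;\; \mathbf{F} \otimes \mathbf{I}_d]$ are our own choices.

```python
import numpy as np

def bomp(D, y, d, k, tol=1e-10):
    """Block OMP sketch: D has orthonormal L x d blocks, y = D @ x0,
    and k is the maximum number of blocks to select."""
    N = D.shape[1]
    M = N // d
    y = np.asarray(y, dtype=complex)
    residual = y.copy()                        # r_0 = y
    chosen = []                                # index set I
    x_hat = np.zeros(N, dtype=complex)
    for _ in range(k):
        if np.linalg.norm(residual) <= tol:    # nothing left to explain
            break
        # (21): block whose correlation with the residual has largest l2 norm
        corr = (D.conj().T @ residual).reshape(M, d)
        chosen.append(int(np.argmax(np.linalg.norm(corr, axis=1))))
        cols = np.concatenate([np.arange(i * d, (i + 1) * d) for i in chosen])
        # (22): least-squares fit of y on all blocks chosen so far
        coef = np.linalg.lstsq(D[:, cols], y, rcond=None)[0]
        x_hat[:] = 0
        x_hat[cols] = coef
        # (23): updated residual
        residual = y - D[:, cols] @ coef
    return x_hat, chosen

# Illustration dictionary (an assumption): D = [I, F kron I_d], L = 64, N = 128.
M_half, d = 32, 2
L = M_half * d
idx = np.arange(M_half)
F = np.exp(2j * np.pi * np.outer(idx, idx) / M_half) / np.sqrt(M_half)
D = np.hstack([np.eye(L), np.kron(F, np.eye(d))])

rng = np.random.default_rng(3)
support = [3, 17, 40]                          # k = 3 nonzero blocks
x0 = np.zeros(2 * L, dtype=complex)
for i in support:
    x0[i * d:(i + 1) * d] = rng.standard_normal(d)
x_hat, chosen = bomp(D, D @ x0, d, k=3)
print(sorted(chosen), np.allclose(x_hat, x0))
```

For this dictionary the block-coherence equals $1/\sqrt{dL}$, so a block 3-sparse vector with $d = 2$ falls within the range covered by the recovery condition discussed in the sequel.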



4.2. Recovery conditions

Our main result, summarized in Theorem 3 below, is that any block
$k$-sparse vector $\mathbf{x}$ can be recovered from measurements
$\mathbf{y} = \mathbf{D}\mathbf{x}$ using either the BOMP algorithm or (20) if the
block-coherence satisfies $kd < (\mu_B^{-1} + d)/2$. If $\mathbf{x}$ was treated as a
(conventional) $kd$-sparse vector without exploiting knowledge of the block-sparse
structure, a sufficient condition for perfect recovery using OMP or
(20) for $d = 1$ (a.k.a. basis pursuit) is $kd < (\mu^{-1} + 1)/2$. Since
$\mu \ge \mu_B$, exploiting the block structure by using BOMP or (20) guarantees
recovery for a potentially higher sparsity level.

To state our results, suppose that $\mathbf{x}_0$ is a length-$N$ block $k$-sparse
vector, and let $\mathbf{y} = \mathbf{D}\mathbf{x}_0$ where $\mathbf{D}$ consists of
blocks $\mathbf{D}[\ell]$ with orthonormal columns. Let $\mathbf{D}_0$ denote the
$L \times (kd)$ matrix whose blocks correspond to the non-zero blocks of $\mathbf{x}_0$,
and let $\bar{\mathbf{D}}_0$ be the matrix of size $L \times (N - kd)$ which contains the
columns of $\mathbf{D}$ not in $\mathbf{D}_0$. We then have the following theorem proved
in Section 5.

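Theorem 2. Let $\mathbf{x}_0 \in \mathbb{C}^N$ be a block $k$-sparse vector with blocks of
length $d$, and let $\mathbf{y} = \mathbf{D}\mathbf{x}_0$ for a given $L \times N$ matrix
$\mathbf{D}$. A sufficient condition for the output of the BOMP and of (20) to equal
$\mathbf{x}_0$ is that

$$\rho_c(\mathbf{D}_0^\dagger \bar{\mathbf{D}}_0) < 1 \tag{24}$$

where

$$\rho_c(\mathbf{A}) = \max_{\ell} \sum_{r} \rho(\mathbf{A}[r, \ell]) \tag{25}$$

and $\mathbf{A}[r, \ell]$ is the $(r, \ell)$th $d \times d$ block of $\mathbf{A}$.
Note that

$$\rho_c(\mathbf{D}_0^\dagger \bar{\mathbf{D}}_0) = \max_{\ell} \rho_c(\mathbf{D}_0^\dagger \bar{\mathbf{D}}_0[\ell]). \tag{26}$$

Therefore, (24) implies that for all $\ell$,

$$\rho_c(\mathbf{D}_0^\dagger \bar{\mathbf{D}}_0[\ell]) < 1. \tag{27}$$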



The sufficient condition (24) depends on $\mathbf{D}_0$ and hence on the
location of the nonzero blocks in $\mathbf{x}_0$, which, of course, is not known
in advance. Nonetheless, as the following theorem shows, (24) holds
whenever the dictionary $\mathbf{D}$ has low block-coherence.
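When the support is known, for instance in a simulation, condition (24) can be evaluated directly. The sketch below assumes NumPy and takes $\rho$ as the largest singular value; the names `rho_c` and `condition_24` as well as the test dictionary $\mathbf{D} = [\mathbf{I} \;\; \mathbf{F} \otimes \mathbf{I}_d]$ are assumptions made for illustration.

```python
import numpy as np

def rho_c(A, d):
    """Largest block-column sum of spectral norms of the d x d blocks of A."""
    R, M = A.shape[0] // d, A.shape[1] // d
    return max(
        sum(np.linalg.norm(A[r*d:(r+1)*d, l*d:(l+1)*d], 2) for r in range(R))
        for l in range(M)
    )

def condition_24(D, d, support):
    """Evaluate rho_c(pinv(D0) @ D0_bar) for the given block support."""
    N = D.shape[1]
    on = np.concatenate([np.arange(l * d, (l + 1) * d) for l in support])
    off = np.setdiff1d(np.arange(N), on)   # columns of D that are not in D0
    D0, D0_bar = D[:, on], D[:, off]
    return rho_c(np.linalg.pinv(D0) @ D0_bar, d)

# Illustration dictionary (an assumption): D = [I, F kron I_d], L = 64, N = 128.
M_half, d = 32, 2
L = M_half * d
idx = np.arange(M_half)
F = np.exp(2j * np.pi * np.outer(idx, idx) / M_half) / np.sqrt(M_half)
D = np.hstack([np.eye(L), np.kron(F, np.eye(d))])

value = condition_24(D, d, support=[3, 17, 40])
print(value < 1.0)
```

Removing whole blocks from $\mathbf{D}$ keeps the remaining columns aligned to the block grid, so `rho_c` can operate on $\mathbf{D}_0^\dagger \bar{\mathbf{D}}_0$ without re-indexing.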

Theorem 3. [12] Let $\mu_B$ be the block-coherence defined by (5).
Then (24) is satisfied if

$$kd < \frac{1}{2}(\mu_B^{-1} + d). \tag{28}$$

For $d = 1$, we recover the results of [7, 10].
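To make the gain over the conventional condition concrete, the thresholds can be evaluated for a specific pair of coherence values. The numbers below are for the dictionary $\mathbf{D} = [\mathbf{I} \;\; \mathbf{F} \otimes \mathbf{I}_d]$ with $L = Md$ rows, which is our own illustrative choice; for it, $\mu_B = 1/\sqrt{dL}$ and $\mu = 1/\sqrt{M}$. The helper name `max_block_sparsity` is hypothetical.

```python
import math

def max_block_sparsity(mu_b, d):
    """Largest integer k with k*d < (1/mu_b + d) / 2, cf. (28)."""
    bound = (1.0 / mu_b + d) / 2.0
    return max(math.ceil(bound / d) - 1, 0)

M, d = 32, 2
L = M * d
k_block = max_block_sparsity(1 / math.sqrt(d * L), d)  # block-aware guarantee
s_conv = max_block_sparsity(1 / math.sqrt(M), 1)       # d = 1: plain sparsity level
k_conv = s_conv // d                                   # whole blocks that fit in s_conv
print(k_block, k_conv)  # prints 3 1
```

In this example the block-coherence condition covers up to three nonzero blocks, whereas treating the signal as conventionally sparse covers only one.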



5. PROOF OF THEOREM 2 



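We start with some definitions. For $\mathbf{x} \in \mathbb{C}^N$, we define the general
mixed $\ell_2/\ell_p$ norm:

$$\|\mathbf{x}\|_{2,p} = \|\mathbf{v}\|_p, \quad \text{where } v_\ell = \|\mathbf{x}[\ell]\|_2 \tag{29}$$

and the $\mathbf{x}[\ell]$ are consecutive length-$d$ blocks. For an $L \times N$ matrix
$\mathbf{A}$ with $L = Rd$ and $N = Md$, where $R$ and $M$ are integers, we
define the mixed matrix norm (with block size $d$) as

$$\|\mathbf{A}\|_{2,p} = \max_{\mathbf{x} \ne \mathbf{0}} \frac{\|\mathbf{A}\mathbf{x}\|_{2,p}}{\|\mathbf{x}\|_{2,p}}. \tag{30}$$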
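The vector norm (29) amounts to taking block-wise Euclidean norms and then an ordinary $\ell_p$ norm. A minimal sketch, assuming NumPy and using the hypothetical name `mixed_norm`:

```python
import numpy as np

def mixed_norm(x, d, p):
    """||x||_{2,p}: the l_p norm of the vector of block-wise l_2 norms."""
    v = np.linalg.norm(np.asarray(x).reshape(-1, d), axis=1)  # v_l = ||x[l]||_2
    return float(np.linalg.norm(v, ord=p))

x = np.array([3.0, 4.0, 0.0, 0.0, 5.0, 12.0])
print(mixed_norm(x, 2, 1))        # 5 + 0 + 13 = 18.0
print(mixed_norm(x, 2, np.inf))   # max(5, 0, 13) = 13.0
```

With `p=1` this is the objective minimized in (20). The matrix norm (30) involves a maximization over all $\mathbf{x}$ and is not computed here.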



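The following lemma provides bounds on the mixed matrix
norms for $p = 1, \infty$, which we will use in the sequel.

Lemma 1. [12] Let $\mathbf{A}$ be an $L \times N$ matrix with $L = Rd$ and
$N = Md$. Denote by $\mathbf{A}[\ell, r]$ the $(\ell, r)$th $d \times d$ block of
$\mathbf{A}$. Then,

$$\|\mathbf{A}\|_{2,\infty} \le \max_{r} \sum_{\ell} \rho(\mathbf{A}[r, \ell]) \triangleq \rho_r(\mathbf{A}) \tag{31}$$

$$\|\mathbf{A}\|_{2,1} \le \max_{\ell} \sum_{r} \rho(\mathbf{A}[r, \ell]) \triangleq \rho_c(\mathbf{A}). \tag{32}$$

In particular, $\rho_r(\mathbf{A}) = \rho_c(\mathbf{A}^H)$.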
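The bounds of Lemma 1 can be spot-checked on a random matrix. The sketch below assumes NumPy and takes $\rho$ as the largest singular value; it verifies the inequalities for one random vector only, which illustrates but does not prove the lemma. All function names are ours.

```python
import numpy as np

def block_norms(A, d):
    """Matrix whose (r, l) entry is the spectral norm of the block A[r, l]."""
    R, M = A.shape[0] // d, A.shape[1] // d
    return np.array([[np.linalg.norm(A[r*d:(r+1)*d, l*d:(l+1)*d], 2)
                      for l in range(M)] for r in range(R)])

def rho_r(A, d):
    return block_norms(A, d).sum(axis=1).max()   # largest block-row sum

def rho_c(A, d):
    return block_norms(A, d).sum(axis=0).max()   # largest block-column sum

def mixed_norm(x, d, p):
    return np.linalg.norm(np.linalg.norm(x.reshape(-1, d), axis=1), ord=p)

rng = np.random.default_rng(4)
d, R, M = 2, 3, 4
A = rng.standard_normal((R * d, M * d)) + 1j * rng.standard_normal((R * d, M * d))
x = rng.standard_normal(M * d)

ok_inf = mixed_norm(A @ x, d, np.inf) <= rho_r(A, d) * mixed_norm(x, d, np.inf) + 1e-12
ok_one = mixed_norm(A @ x, d, 1) <= rho_c(A, d) * mixed_norm(x, d, 1) + 1e-12
print(ok_inf, ok_one, np.isclose(rho_r(A, d), rho_c(A.conj().T, d)))
```

The last check reflects that conjugate transposition swaps block rows and block columns while leaving each block's spectral norm unchanged.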
5.1. Block OMP 

We begin by proving that (24) is sufficient to ensure recovery using
the BOMP algorithm.

To prove the result, we first show that if $\mathbf{r}_{\ell-1}$ is in
$\mathcal{R}(\mathbf{D}_0)$, then the next chosen index $i_\ell$ will be correct, namely
it will correspond to a block in $\mathbf{D}_0$. Assuming that this is true, it follows
immediately that $i_1$ is correct since clearly $\mathbf{r}_0 = \mathbf{y}$ lies in
$\mathcal{R}(\mathbf{D}_0)$. Noting that $\mathbf{r}_\ell$ lies in the space spanned by
$\mathbf{y}$ and $\mathbf{D}[i]$, $i \in \mathcal{I}_\ell$, where $\mathcal{I}_\ell$ denotes
the indices chosen up to stage $\ell$, it follows that if $\mathcal{I}_\ell$ corresponds to
correct indices, i.e., $\mathbf{D}[i]$ is a block of $\mathbf{D}_0$ for all
$i \in \mathcal{I}_\ell$, then $\mathbf{r}_\ell$ also lies in $\mathcal{R}(\mathbf{D}_0)$
and the next index will be correct as well. Thus, at
every step a correct subset is selected. It is also clear that no index
will be chosen twice since the new residual is orthogonal to all the
previously chosen subspaces; consequently the correct $\mathbf{x}_0$ will be
recovered in $k$ steps.

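It therefore remains to show that if $\mathbf{r}_{\ell-1} \in \mathcal{R}(\mathbf{D}_0)$,
then under (24) the next chosen index corresponds to a block in $\mathbf{D}_0$. This is
equivalent to requiring that

$$z(\mathbf{r}_{\ell-1}) = \frac{\|\bar{\mathbf{D}}_0^H \mathbf{r}_{\ell-1}\|_{2,\infty}}{\|\mathbf{D}_0^H \mathbf{r}_{\ell-1}\|_{2,\infty}} < 1. \tag{33}$$

From the properties of the pseudo-inverse,
$\mathcal{R}(\mathbf{D}_0) = \mathcal{R}(\mathbf{D}_0\mathbf{D}_0^\dagger)$
and consequently $\mathbf{D}_0\mathbf{D}_0^\dagger \mathbf{r}_{\ell-1} = \mathbf{r}_{\ell-1}$.
Since $\mathbf{D}_0\mathbf{D}_0^\dagger$ is Hermitian,

$$(\mathbf{D}_0^\dagger)^H \mathbf{D}_0^H \mathbf{r}_{\ell-1} = \mathbf{r}_{\ell-1}. \tag{34}$$

Substituting (34) into (33) yields

$$z(\mathbf{r}_{\ell-1}) = \frac{\|\bar{\mathbf{D}}_0^H (\mathbf{D}_0^\dagger)^H \mathbf{D}_0^H \mathbf{r}_{\ell-1}\|_{2,\infty}}{\|\mathbf{D}_0^H \mathbf{r}_{\ell-1}\|_{2,\infty}} \le \rho_r(\bar{\mathbf{D}}_0^H (\mathbf{D}_0^\dagger)^H) = \rho_c(\mathbf{D}_0^\dagger \bar{\mathbf{D}}_0) \tag{35}$$

where we used Lemma 1. This completes the proof.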



5.2. $\ell_2/\ell_1$ Optimization

We now show that (24) is also sufficient to ensure recovery using
(20). To this end we rely on the following lemma:

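Lemma 2. [12] Suppose that $\mathbf{v}$ is a length $N = Md$ vector with
$\|\mathbf{v}[\ell]\|_2 > 0$, $\forall \ell$, and that $\mathbf{A}$ is a matrix of size
$L \times N$, where $L = Rd$ and the blocks $\mathbf{A}[\ell, r]$ are of size $d \times d$.
Then, $\|\mathbf{A}\mathbf{v}\|_{2,1} \le \rho_c(\mathbf{A})\|\mathbf{v}\|_{2,1}$. If in
addition the values of $\rho_c(\mathbf{A}\mathbf{J}_\ell)$ are not all equal,
then the inequality is strict. Here, $\mathbf{J}_\ell$ is an $N \times d$ matrix that is all
zero except for the $\ell$th $d \times d$ block which equals $\mathbf{I}_d$.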

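To prove that (20) recovers the correct vector $\mathbf{x}_0$, let $\mathbf{x}'$ be
another set of coefficients for which $\mathbf{y} = \mathbf{D}\mathbf{x}'$. Denote by
$\mathbf{c}_0$ and $\mathbf{c}'$ the length $kd$ vectors consisting of the non-zero
elements of $\mathbf{x}_0$ and $\mathbf{x}'$, respectively. Let $\mathbf{D}_0$ and
$\mathbf{D}'$ denote the corresponding columns of $\mathbf{D}$ so that
$\mathbf{y} = \mathbf{D}_0\mathbf{c}_0 = \mathbf{D}'\mathbf{c}'$. From the assumption in
Proposition 1, it follows that there cannot be two different representations
using the same blocks $\mathbf{D}_0$. Therefore, $\mathbf{D}'$ must contain at least one
block, $\mathbf{Z}$, that is not included in $\mathbf{D}_0$. From (27),
$\rho_c(\mathbf{D}_0^\dagger \mathbf{Z}) < 1$. For any other block $\mathbf{U}$ in
$\mathbf{D}$, we must have that

$$\rho_c(\mathbf{D}_0^\dagger \mathbf{U}) \le 1. \tag{36}$$

Indeed, if $\mathbf{U} \in \mathbf{D}_0$, then
$\mathbf{U} = \mathbf{D}_0[\ell] = \mathbf{D}_0\mathbf{J}_\ell$ where $\mathbf{J}_\ell$ is a
matrix with $d$ columns which is all zero, except for the $\ell$th block which is
equal to $\mathbf{I}_d$. In this case,
$\mathbf{D}_0^\dagger \mathbf{D}_0[\ell] = \mathbf{J}_\ell$ and hence
$\rho_c(\mathbf{D}_0^\dagger \mathbf{D}_0[\ell]) = \rho_c(\mathbf{J}_\ell) = 1$. If, on the
other hand, $\mathbf{U} = \bar{\mathbf{D}}_0[\ell]$ for some $\ell$, then it
follows from (27) that $\rho_c(\mathbf{D}_0^\dagger \mathbf{U}) < 1$.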

Now, suppose first that the blocks in $\mathbf{D}_0^\dagger \mathbf{D}'$ do not all have the
same spectral radius $\rho$. Then,

$$\|\mathbf{c}_0\|_{2,1} = \|\mathbf{D}_0^\dagger \mathbf{D}_0 \mathbf{c}_0\|_{2,1} = \|\mathbf{D}_0^\dagger \mathbf{y}\|_{2,1} = \|\mathbf{D}_0^\dagger \mathbf{D}'\mathbf{c}'\|_{2,1} < \rho_c(\mathbf{D}_0^\dagger \mathbf{D}')\|\mathbf{c}'\|_{2,1} \le \|\mathbf{c}'\|_{2,1} \tag{37}$$

where the first equality stems from the fact that the columns of
$\mathbf{D}_0$ are linearly independent (a consequence of the assumption in
Proposition 1), the first inequality follows from Lemma 2 since
$\|\mathbf{c}'[\ell]\|_2 > 0$, $\forall \ell$, and the last inequality follows from (36).
If all the blocks of $\mathbf{D}_0^\dagger \mathbf{D}'$ have identical spectral radius
$\rho$, then $\rho < 1$ as for $\mathbf{Z} \in \mathbf{D}'$,
$\rho_c(\mathbf{D}_0^\dagger \mathbf{Z}) < 1$. Repeating the calculations in (37), we
find that the first inequality is no longer strict. However, the second
inequality in (37) is strict instead so that the conclusion still holds.

Since $\|\mathbf{x}_0\|_{2,1} = \|\mathbf{c}_0\|_{2,1}$ and
$\|\mathbf{x}'\|_{2,1} = \|\mathbf{c}'\|_{2,1}$, we conclude
that under (27), any set of coefficients used to represent the original
signal that is not equal to $\mathbf{x}_0$ will result in a larger $\ell_2/\ell_1$ norm.